Pincer-Search: An Efficient Algorithm for Discovering the Maximum Frequent Set
Authors
Abstract
Discovering frequent itemsets is a key problem in important data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. Typical algorithms for solving this problem operate in a bottom-up, breadth-first search direction. The computation starts from frequent 1-itemsets (the minimum length frequent itemsets) and continues until all maximal (length) frequent itemsets are found. During the execution, every frequent itemset is explicitly considered. Such algorithms perform well when all maximal frequent itemsets are short. However, performance decreases drastically when some of the maximal frequent itemsets are relatively long. We present a new algorithm which combines the bottom-up and the top-down searches. The primary search direction is still bottom-up, but a restricted search is also conducted in the top-down direction. This search is used only for maintaining and updating a new data structure, the maximum frequent candidate set. This structure is used to prune, early on, candidates that would normally be encountered in the bottom-up search. A very important characteristic of the algorithm is that it does not require explicit examination of every frequent itemset. Therefore, the algorithm performs well even when some maximal frequent itemsets are long. As its output, the algorithm produces the maximum frequent set, i.e., the set containing all maximal frequent itemsets, thus immediately specifying all frequent itemsets. We evaluate the performance of the algorithm using well-known synthetic benchmark databases and real-life census and ...
∗ Applied Research, Telcordia Technologies, Inc., 445 South Street, Morristown, NJ 07960, +1 973 829 4740.
† Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, 251 Mercer St., New York, NY 10012-1185, +1 212 998 3101.
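The abstract names the maximum frequent candidate set but does not spell out how it is updated or used for pruning. The following is a minimal Python sketch of one plausible reading, not the authors' implementation. It assumes that whenever the bottom-up pass finds an infrequent itemset, every candidate set containing it is split by dropping one of its items at a time, and that a bottom-up candidate can be skipped once it is a subset of an itemset already known to be frequent. The names `update_mfcs` and `is_covered` are hypothetical.

```python
def update_mfcs(mfcs, infrequent_itemsets):
    """Shrink the candidate set so that no element contains a known infrequent itemset.

    mfcs: iterable of frozensets (the current maximum frequent candidate set).
    infrequent_itemsets: iterable of frozensets found infrequent bottom-up.
    Assumed update rule for illustration only.
    """
    mfcs = set(mfcs)
    for s in infrequent_itemsets:
        # Snapshot the elements that contain s; replacements never contain s.
        for m in [m for m in mfcs if s <= m]:
            mfcs.discard(m)
            for item in s:
                shrunk = m - {item}
                # Keep only itemsets that are not subsets of another element.
                if not any(shrunk <= other for other in mfcs):
                    mfcs.add(shrunk)
    return mfcs


def is_covered(candidate, known_frequent):
    """True if candidate is a subset of an itemset already known to be frequent,
    so the bottom-up pass does not need to count it."""
    return any(candidate <= m for m in known_frequent)


# Usage: six items, with {1, 6} and {3, 6} found infrequent by the bottom-up pass.
start = {frozenset({1, 2, 3, 4, 5, 6})}
result = update_mfcs(start, [frozenset({1, 6}), frozenset({3, 6})])
print(sorted(sorted(m) for m in result))  # [[1, 2, 3, 4, 5], [2, 4, 5, 6]]

# If {1, 2, 3, 4, 5} is later counted and found frequent, its subsets can be skipped.
print(is_covered(frozenset({2, 4}), [frozenset({1, 2, 3, 4, 5})]))  # True
```

The sketch covers only the top-down bookkeeping. A full algorithm would also need support counting against the database and a candidate-generation step that accounts for itemsets skipped by the pruning.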
Similar resources
A comprehensive method for discovering the maximal frequent set
Association rule mining can be divided into two steps. The first step is to find all frequent itemsets, whose occurrences are greater than or equal to the user-specified threshold. The second step is to generate reliable association rules based on all frequent itemsets found in the first step. Identifying all frequent itemsets in a large database dominates the overall performance in the a...
Discovering Maximal Frequent Item set using Association Array and Depth First Search Procedure with Effective Pruning Mechanisms
The first step of association rule mining is finding all frequent itemsets. The generation of reliable association rules is based on all frequent itemsets found in the first step. Obtaining all frequent itemsets in a large database dominates the overall performance of association rule mining. In this paper, an efficient method for discovering the maximal frequent itemsets is proposed. This met...
Pincer-Search: A New Algorithm for Discovering the Maximum Frequent Set
Discovering frequent itemsets is a key problem in important data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. Typical algorithms for solving this problem operate in a bottom-up breadth-first search direction. The computation starts from frequent 1-itemsets (minimal length frequent itemsets) and continues until all maximal (length) freq...
An Efficient Algorithm for Mining Multilevel Association Rule Based on Pincer Search
Discovering frequent itemsets is a key difficulty in significant data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. The problem of developing models and algorithms for multilevel association mining poses new challenges for mathematics and computer science. In this paper, we present a model of mining multilevel association rules whi...
Max-Miner Algorithm Using Knowledge Discovery Process in Data Mining
Discovering frequent item sets is an important key problem in data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. Typical algorithms for solving this problem operate in a bottom-up, breadth-first search direction. The computation starts from frequent itemsets (the minimum length frequent itemsets) and continues until all maximal (lengt...
Journal title:
- IEEE Trans. Knowl. Data Eng.
Volume 14
Publication date: 2002